Hi. The next step is to understand what conjugacy means. Let's first recall what orthogonality
means: u is orthogonal to v means that the scalar product u-transposed times v is
equal to zero. Then u and v are called orthogonal. And then the definition, I don't know the
number actually, let's call it 4.1, might be wrong, let's call it 4.1, is that u and
v are called b-orthogonal, where b has to be an SPD matrix, so symmetric and positive definite,
if and only if the following is true: u-transposed b v is equal to zero. Because b is symmetric, this is also equivalent
to saying that v-transposed b times u is equal to zero. The notation for that reads
"u is b-orthogonal to v", and we call u and v b-orthogonal or conjugate. Of course, if
we say conjugate, then we have to be sure that we know which matrix b we mean.
So in a specific setting, there will usually be a unique matrix b, and then we will just
call it conjugate. If we want to stress exactly which matrix we mean, we call it b-orthogonal.
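
As a small illustration of the definition, here is a minimal sketch in Python with NumPy. The function name `is_b_orthogonal`, the tolerance, and the example matrix and vectors are assumptions made for this illustration; they are not taken from the lecture.

```python
import numpy as np

def is_b_orthogonal(u, v, B, tol=1e-12):
    """Return True if u^T B v is numerically zero (B is assumed to be SPD)."""
    return abs(u @ B @ v) <= tol

# Hypothetical SPD matrix, chosen only for illustration.
B = np.array([[2.0, 1.0],
              [1.0, 3.0]])

u = np.array([1.0, 0.0])
# B u = (2, 1), so any v orthogonal to (2, 1) is b-orthogonal to u.
v = np.array([1.0, -2.0])

print(is_b_orthogonal(u, v, B))  # u^T B v = 2*1 + 1*(-2) = 0
print(is_b_orthogonal(v, u, B))  # same answer with u and v swapped, since B is symmetric
print(u @ v)                     # 1.0, so u and v are not orthogonal in the usual sense
```

With the identity matrix in place of `B`, the check reduces to the usual orthogonality test.
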
Okay, now what does that mean? What's the geometric idea of that? Of course, the
usual notion of orthogonality is, if you want, identity-orthogonality: the case where b is the
identity matrix. But what if b is, for example, a matrix such that level sets of the quadratic form
u-transposed b u look like those ellipsoids? If b is an SPD matrix, then you can always draw those
ellipsoids, because level sets of u-transposed b u will look like ellipsoids. And it's actually
quite easy to explain what b-orthogonal vectors look like. For that we switch again to
this website. Let's maybe zoom in a bit here. The idea is to look at
those level sets and stretch this image such that those level sets become circles. So you
can see how we're kind of pulling here and here for as long as we need until those ellipsoids
become circles. As you can see, everything is stretched out; if we were to print
this on a bubblegum sheet, so to speak, you could do this in principle. And what do b-orthogonal
vectors look like? Well, it's like this: two vectors are b-orthogonal if and only
if in the stretched version of reality they are actually orthogonal. For all those pairs
of vectors, the position actually doesn't matter. We could have taken this blue pair
of vectors, so this vector pointing to the right, this vector pointing to the lower right.
We could move them around in space; it doesn't depend on the spatial position where we attach
them. But if we draw them on this bubblegum sheet and apply this stretch,
and they look orthogonal in this plot, then they are b-orthogonal. So as you can
see, sometimes conjugacy or b-orthogonality is the same as orthogonality in the usual
sense. If you take those vectors here and we stretch them, we get those vectors, and
they look orthogonal in both pictures. But sometimes it doesn't look like they're
orthogonal. For example, these two vectors have an acute angle between them. But
stretching these two vectors gives us a right angle here, so they're orthogonal in this
picture, which means that these two vectors are b-orthogonal. Similarly here: they are
at an obtuse angle, so the angle between those two vectors is larger than 90 degrees. But
after stretching this picture, those two vectors are again at a right angle to each other. And
if you now look at this blue pair of vectors, we can change their orientation, and
you can see that the angle changes between obtuse like this, a right angle, and an acute
angle like this. So b-orthogonality depends on the direction of those two vectors. It's
not a fixed angle between those two vectors; what the angle has to be depends on the direction of one of
those vectors. In my opinion, the nicest explanation is: two
vectors are b-orthogonal if, after stretching the domain such that the level sets of this quadratic
form are circles, those two vectors appear to be orthogonal in this stretched domain.
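
The stretching picture can be checked numerically. The sketch below assumes that the stretch is the symmetric square root S of b, so that S times S equals b; this is one natural choice that turns the level sets of u-transposed b u into circles, but the lecture does not state which map the website uses. The helper names `stretch_matrix` and `conjugate_angle_deg` and the example matrix are hypothetical.

```python
import numpy as np

def stretch_matrix(B):
    """Symmetric square root S of an SPD matrix B, so that S @ S equals B."""
    eigvals, eigvecs = np.linalg.eigh(B)
    return eigvecs @ np.diag(np.sqrt(eigvals)) @ eigvecs.T

def conjugate_angle_deg(theta, B):
    """Angle (degrees) between the unit vector at angle theta and one
    b-orthogonal partner, in the unstretched 2D picture."""
    u = np.array([np.cos(theta), np.sin(theta)])
    w = B @ u
    v = np.array([-w[1], w[0]])  # perpendicular to B u, hence b-orthogonal to u
    cos_angle = (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))

# Hypothetical SPD matrix, chosen only for illustration.
B = np.array([[2.0, 1.0],
              [1.0, 3.0]])
S = stretch_matrix(B)

u = np.array([1.0, 0.0])
v = np.array([1.0, -2.0])    # b-orthogonal to u: u^T B v = 0

su, sv = S @ u, S @ v        # the two vectors after stretching

print(np.isclose(u @ B @ v, su @ sv))  # u^T B v equals the plain dot product after stretching
print(np.isclose(su @ sv, 0.0))        # the stretched vectors are orthogonal

# A point on the level set u^T B u = 1 lands on the unit circle after stretching.
p = u / np.sqrt(u @ B @ u)
print(np.isclose(np.linalg.norm(S @ p), 1.0))

# The required angle in the unstretched picture depends on the direction of u.
for theta in np.linspace(0.0, np.pi, 5):
    print(round(theta, 2), conjugate_angle_deg(theta, B))
```

Choosing the partner `-v` instead of `v` replaces each angle by 180 degrees minus that angle, which matches the switch between obtuse and acute pairs in the picture. For the identity matrix the angle is 90 degrees in every direction.
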
So that's conjugacy, and this is kind of the right geometrical idea here. So if we had...
well, let me explain this differently. Switch back to the lecture. So another throwback
to gradient descent: if b is the identity matrix of size, let's say, n times n, then the
level sets are circles, so not ellipses, not ellipsoids; they are circles or spheres, hyperspheres,
whatever. So it looks like this: this is the domain, the minimum is somewhere, and the
level sets are spheres around this minimum. And if we now take any point, it doesn't
Access: Open access
Duration: 00:52:13
Recording date: 2021-12-14
Uploaded: 2021-12-14 12:46:04
Language: en-US